{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Benchmarking tips\n",
    "\n",
    "In the `Julia is fast` notebook, we saw the package `BenchmarkTools` and used its `@benchmark` macro.\n",
    "\n",
    "In this notebook, we'll explore the importance of \"interpolating\" global variables when benchmarking functions.\n",
    "\n",
    "We interpolate a global variable by throwing a `$` in front of it. For example, in `Julia is fast`, we benchmarked the `sum` function using `Vector` `A` via\n",
    "\n",
    "```julia\n",
    "@benchmark sum($A)\n",
    "```\n",
    "\n",
    "not\n",
    "\n",
    "```julia\n",
    "@benchmark sum(A)\n",
    "```\n",
    "\n",
    "Let's see if this can make a difference by examining the ratio in execution times of `sum($A)` and `sum(A)` for differently sized arrays `A`. \n",
    "\n",
    "#### Exercise\n",
    "\n",
    "Call the `sum` function on a pseudo-randomly populated 1D array called `foo` of several lengths between 2 and 2^20 (~10^6). For each size of `foo`, determine the ratio of execution times for `sum(foo)` and `sum($foo)`. (To determine this ratio, use the minimum run times in each case.)\n",
    "\n",
    "Plot the ratio of execution times for non-interpolated and interpolated `foo` in calls to `sum` versus the length of `foo`. Does interpolating `foo` seem to matter? If so, for what sizes of `foo`?\n",
    "\n",
    "#### Solution\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "using BenchmarkTools, Plots"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "gr()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "store_ratios = []\n",
    "foo = rand()\n",
    "for i in 1:20\n",
    "    foo = rand(2^i) \n",
    "    not_interpolated = @benchmark sum(foo)\n",
    "    interpolated = @benchmark sum($foo)\n",
    "    min_not_interpolated = minimum(not_interpolated.times)\n",
    "    min_interpolated = minimum(interpolated.times)\n",
    "    ratio = min_not_interpolated / min_interpolated\n",
    "    append!(store_ratios, ratio)\n",
    "end"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plot([i for i in 1:20], store_ratios, legend=false)\n",
    "scatter!([i for i in 1:20], store_ratios)\n",
    "xlabel!(\"The log of the length of `foo`\")\n",
    "ylabel!(\"Factor of performance increase\")\n",
    "title!(\"Performance increase from interpolating a global variable\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this case, interpolating the global variable is more important when that global variable is bound to a relatively small array. For arrays with a few thousand elements and more, interpolation doesn't seem to have an impact."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Performance tips -- type stability\n",
    "\n",
    "One way to optimize code in Julia is to ensure **type stability**. If the type(s) of some variables in a function are subject to change or ambiguity, the compiler cannot reason as well about those variables, and performance will take a hit. Conversely, we allow the compiler to optimize and generate more efficient machine code when we declare variables so that their types will be fixed throughout the function body.\n",
    "\n",
    "For example, let's say we had functions called `baz` and `bar` with the following definitions\n",
    "\n",
    "```julia\n",
    "function baz()\n",
    "    s = rand()\n",
    "    if s > 2/3\n",
    "        return .666667\n",
    "    elseif s > 1/3\n",
    "        return 1//3    \n",
    "    else\n",
    "        return 0    \n",
    "    end\n",
    "end\n",
    "```\n",
    "\n",
    "```julia\n",
    "function bar()\n",
    "    s = rand()\n",
    "    if s > 2/3\n",
    "        return .666667\n",
    "    elseif s > 1/3\n",
    "        return .3333333    \n",
    "    else\n",
    "        return 0.0    \n",
    "    end\n",
    "end\n",
    "```\n",
    "\n",
    "When I benchmark these via\n",
    "\n",
    "```julia\n",
    "using BenchmarkTools\n",
    "@benchmark baz()\n",
    "@benchmark bar()\n",
    "```\n",
    "\n",
    "I see that `bar` is almost three times as fast as `baz`! The reason is that `bar` is type stable while `baz` is not: the compiler can tell that `bar` will always return a `Float`, whereas `baz` could return a `Float`, an `Int`, or a `Rational`. When the compiler can tell what the types of outputs from a function, or variables declared *within a function* are without running the code, it can do much better.\n",
    "\n",
    "#### Exercise\n",
    "\n",
    "The following definition for `my_sum` is not type stable. \n",
    "\n",
    "```julia\n",
    "function my_sum(A)\n",
    "    output = 0\n",
    "    for x in A\n",
    "        output += x\n",
    "    end\n",
    "    return output\n",
    "end\n",
    "```\n",
    "\n",
    "Copy and execute the above code into a new cell. Benchmark it using `A = rand(10^3)`. Then write a new function called `my_sum2` with the same function body as `my_sum`. Update `my_sum2` to make it type stable, and benchmark it for a randomly populated array with 10^3 entries.\n",
    "\n",
    "How much does type stability impact performance? If you'd like, try this same exercise for multiple sizes of `A` to see if this changes your answer!\n",
    "\n",
    "#### Solution"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "function my_sum(A)\n",
    "    output = 0\n",
    "    for x in A\n",
    "        output += x\n",
    "    end\n",
    "    return output\n",
    "end"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "using BenchmarkTools\n",
    "\n",
    "A = rand(10^3)\n",
    "@benchmark my_sum($A)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "function my_sum2(A)\n",
    "    output = zero(eltype(A))\n",
    "    # or:\n",
    "    # output = 0.0\n",
    "    for x in A\n",
    "        output += x\n",
    "    end\n",
    "    return output\n",
    "end"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "@benchmark my_sum2($A)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In my run, `my_sum2` outperformed `my_sum` by a factor of over 20 for an array with 1000 elements!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Exercise\n",
    "\n",
    "Make the following code type stable. You'll know your efforts are paying off when you see a performance boost!\n",
    "\n",
    "```julia\n",
    "\"\"\"\n",
    "    my_sqrt(x)\n",
    "    \n",
    "Calculate the square root of `x` with Newton's method.\n",
    "\"\"\"\n",
    "function my_sqrt(x)\n",
    "    output = 1\n",
    "    for i in 1:1000\n",
    "        output = .5 * (output + x/output)\n",
    "    end\n",
    "    output\n",
    "end\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Solution\n",
    "\n",
    "You need to make sure `ouptut`'s type never changes in order to make this function type stable!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# The type unstable version\n",
    "function my_sqrt(x)\n",
    "    output = 1\n",
    "    for i in 1:1000\n",
    "        output = .5 * (output + x/output)\n",
    "    end\n",
    "    output\n",
    "end"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "@benchmark my_sqrt(100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# The type stable version, where `output` is always a floating point number.\n",
    "function my_sqrt(x)\n",
    "    output = 1.0\n",
    "    for i in 1:1000\n",
    "        output = .5 * (output + x/output)\n",
    "    end\n",
    "    output\n",
    "end"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "@benchmark my_sqrt(100)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 1.0.0",
   "language": "julia",
   "name": "julia-1.0"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.0.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
